MSc Data Science

DSM150 Neural Networks Coursework (DL Workflow)

Covid-19 Detection using Convolutional Neural Network (CNN)

Section One:

1: Introduction

The coronavirus disease (COVID-19) is a global pandemic caused by the SARS-CoV-2 virus. Globally, over 400 million cases have been confirmed since the first reported case in November 2019, and the death toll exceeds 5 million. COVID-19 is an infectious disease that spreads quickly from an infected person through the mouth and nose. Most cases are mild to moderate; severe illness and death occur mainly among older people and those with underlying medical conditions.

The main diagnostic approaches for detecting the virus are reverse transcription-polymerase chain reaction (RT-PCR), computed tomography (CT) and chest X-ray, in addition to assessing pneumonia symptoms. Medical centres hold enormous image databases collected through X-ray and CT scans. Fast and early detection of the disease can therefore help control the spread of COVID-19.

On the other hand, computer vision plays a significant role in daily life, from online shopping to assisting doctors and medical providers with diagnosis. Computer vision algorithms benefit from the convolutional neural network (CNN), a type of deep learning model that automates image classification and can thus contribute to early detection of the disease. The network takes images as input and outputs a classification based on the image features. Many companies, such as Google and Microsoft, make use of CNNs and work toward novel designs.

1.1 Literature review:

The COVID-19 pandemic has had a significant impact worldwide, specifically on health sectors, as reflected in mortality and morbidity records. COVID-19 has therefore attracted the interest of many researchers and publishers. Below are two studies that used CNNs to classify COVID-19 from X-ray images.

Abbas et al. (2021) classified COVID-19 from X-ray images by adapting a CNN architecture called Decompose, Transfer, and Compose (DeTraC). The main goal was to overcome irregularities in the annotated data. DeTraC was validated with various models pre-trained on ImageNet, such as VGG19, ResNet and GoogleNet. It achieved high accuracy in detecting COVID-19 from X-ray images, around 98% with VGG19, while DeTraC's overall accuracy was 93.1% with 100% sensitivity.

Another study automated the detection of COVID-19 from X-ray images using CNNs. The paper aimed to distinguish COVID-19 patients from healthy cases and from viral and bacterial pneumonia. A deep transfer learning approach was applied with nine different pre-trained models, including GoogleNet, ResNet-50, Se-ResNeXt-50 and Inception V4. Se-ResNeXt-50 achieved the highest classification accuracy for both the binary and multi-class tasks, at 99.3% and 97.6% respectively.

1.2 Problem Definition

The analysis attempts to detect COVID-19 cases by extracting COVID-19 graphical features to classify chest X-ray images. This is a single-label binary classification problem (normal/covid) and will be detailed further when describing the dataset.

The main objective of this coursework is to develop a model that is more accurate than the baseline model. More specifically, a network is built and trained on a portion of the dataset (training set) and validated on the validation set; the tuned hyperparameters and the optimum epoch are then used to train on the whole training set, so that the model can predict on an unseen dataset (test set).

1.3 Hypotheses

The following hypotheses are considered while developing the NN model:

1.4 Document Structure

The notebook consists of six main sections. The 1st section introduces the topic and describes the model architecture and methodology. The 2nd section covers the preprocessing and modelling (from scratch) phases, from the baseline to the enhanced model. The 3rd section applies a pre-trained convnet model. The following section evaluates the enhanced model on the unseen dataset (test data). After that, general convolutional visualisations are illustrated. Finally, the last section covers the results, conclusions, references and appendix.

2: Methodology and Model Architecture

2.1 Methodology

A Convolutional Neural Network (CNN) is applied to the dataset to classify each patient's x-ray image as normal or covid.

The first step was to import the necessary libraries used throughout the modelling phases. Following that, covid and normal x-ray images were imported, and a balanced sample was taken from the original dataset to ease the modelling process with a small dataset. For each class (covid/normal), this sample was split into train, validation and test sets of 1000, 500 and 500 samples respectively.

In the next step, pre-processing, the images were rescaled and resized, which saved time during training. The pixel values were rescaled from the [0, 255] range to the [0, 1] range. In addition, the images were resized from 299 by 299 pixels to a target size of 150 by 150 pixels.

A baseline model was initialised as a benchmark to be beaten by the final, enhanced model. A model with statistical power was then expanded and trained gradually by increasing the batch size, neurons and filters until it overfitted, resulting in a model with high accuracy that failed to generalise to new data; an optimum epoch was derived from this overfitted model.

Several methods were applied to mitigate the overfitting, such as using callbacks to stop training early when there is no improvement. Moreover, reducing the network size, data augmentation, adding dropout and weight regularization were applied and compared to the overfitted model. Furthermore, hyperparameter tuning was performed on the batch size, number of filters, activation function, optimizer, filter window size, number of layers and padding.

In addition, depthwise separable convolution and batch normalization were deployed as advanced techniques. A pre-trained MiniVGGNet model for grayscale images was then implemented using different approaches. Finally, the model was adjusted using the best-tuned parameters with the optimum epoch, trained on the training dataset and then tested on the test set; the accuracies of the baseline, trained and final models were compared, and predictions were generated from the final model.

2.2 Model architecture

Below are the factors that were considered when building and tuning the neural network model:

  1. Activation function: since this is a binary classification problem, a sigmoid activation function was deployed in the last dense layer to output the probability of the positive class (a single value between 0 and 1). relu was the main activation in the remaining layers, and it was compared to the older tanh activation function.
  2. Loss function: to suit the binary classification problem, the loss function used during modelling was binary_crossentropy.
  3. Accuracy metric: since the dataset is class-balanced, the probability of each class is the same, so accuracy is an appropriate metric.
  4. Optimizer: rmsprop and adam were used throughout the modelling, starting with rmsprop as the default.
  5. Other model parameters: various parameters were tuned until an optimised configuration was achieved, for example the number of filters and the filter window size, which defaults to (3, 3) and can be increased to (5, 5).
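The factors above can be sketched as a minimal Keras model. This is an illustrative sketch only, not the exact coursework architecture; the layer counts and filter/neuron sizes are assumptions.

```python
from tensorflow.keras import layers, models

def build_model(input_shape=(150, 150, 1)):
    """Minimal CNN reflecting the choices above: relu convolutions,
    a single sigmoid output unit, binary_crossentropy and rmsprop."""
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(1, activation='sigmoid'),  # probability of "covid"
    ])
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model
```

The single sigmoid unit suffices for two classes; softmax over two units would be the equivalent multi-class formulation.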

Section Two:

Data Overview, Pre-processing and Modelling

1. Importing Libraries

2. Dataset Description

The X-ray image database, the COVID-19 Radiography Database, was obtained from Kaggle. A team of researchers and other collaborators developed this database. It was collected from various sources, such as the Italian Society of Medical and Interventional Radiology (SIRM) COVID-19 Database, over 40 publications, the Chest X-Ray Images (Pneumonia) database and the Novel Corona Virus 2019 Dataset.

The dataset is publicly available for academic purposes and can be accessed from the following link (here).

The database consists of four classes: COVID, Lung_Opacity, Normal and Viral Pneumonia, with 3616, 6012, 10.2k and 1345 files respectively. The COVID and Normal classes were used in this coursework, with 2000 images in each category. To obtain balanced classes, both Normal and COVID were split into 1000 training, 500 validation and 500 test samples.

2.1 Loading the Datasets

The code below was run once when the file was first loaded; afterwards, the following active code is run instead to read the datasets.

2.2 Dataset Overview

3. Data Preprocessing

3.1 Rescaling and Resizing

Rescale the pixel values (between 0 and 255) to the [0, 1] interval (neural networks prefer to deal with small input values), and set the target size to 150 by 150. Resizing and rescaling were applied to the images to save time in training.

ImageDataGenerator was used to read the images from their directories.
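As a sketch of this step (the directory paths and batch size are hypothetical, assuming each split has covid/ and normal/ subfolders):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from [0, 255] to [0, 1]
datagen = ImageDataGenerator(rescale=1.0 / 255)

# Hypothetical directory layout: data/train/{covid,normal}/...
# train_generator = datagen.flow_from_directory(
#     'data/train',
#     target_size=(150, 150),   # resize 299x299 -> 150x150
#     color_mode='grayscale',
#     batch_size=20,
#     class_mode='binary')
```

class_mode='binary' yields 0/1 labels matching the single sigmoid output.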

3.2 Defining the Model

3.3 Defining the Evaluation Function

3.4 Defining the plotting function

4. Baseline Model

The baseline model serves as a benchmark for the other developed models. The main target is to develop a model that beats the baseline.

There are two approaches to initialise a baseline model: either by common sense or by using basic machine learning. The best choice depends on the task and on whether the dataset is balanced. In this case, both approaches are applied and discussed to determine the more suitable one.

4.1 Common-sense, Non-machine Learning Baseline Model

After downloading the dataset, a balanced subset was created from the original one.

Each set (train, validation and test) was divided equally between normal and covid. This balance leads to an accurate common-sense baseline.

This is a balanced binary classification problem

with a common sense baseline prediction accuracy of 50%
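This common-sense baseline can be sketched directly: always predict the majority class, which on a balanced split yields 50% accuracy.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

labels = ['covid'] * 500 + ['normal'] * 500   # balanced test split
print(majority_baseline_accuracy(labels))      # -> 0.5
```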

4.2 Basic Machine Learning Baseline Approach

A simple baseline model was designed with one hidden layer, and a sigmoid function was used throughout the modelling process (as mentioned in the model architecture). The basic baseline model was trained on the training set, validated on the validation set and then tested on the test set (unseen data).

The number of channels, given as a parameter in input_shape, equals 1, which means the x-ray images used are black and white (grayscale).

The baseline model accuracy for both approaches is 51%. The next phase is to build a model that beats the baseline.

5. Develop a Statistical Power Model (Training From Scratch)

Here, a small, underfitted model is built to beat the defined baseline; this model will be gradually enhanced until it overfits.

5.1 Model 1: Building a Small Model that Beats the Baseline

Generating a basic model with higher accuracy than the baseline by adding convolutional layers.

5.1.1 Evaluate the Model Performance

5.1.2 Plotting Training and Validation Loss and Accuracy

5.1.3 Comparison Between the Baseline and Model 1

5.2 Model 2: Increasing the Model Capability

5.2.1 Evaluate the Model Performance

5.2.2 Plotting Training and Validation Loss and Accuracy

5.2.3 Comparison Between the Baseline, Model 1 and Model 2

5.3 Model 3: Building a Powerful and Overfitted Model

5.3.1 Plotting Training and Validation Loss and Accuracy

5.3.2 Comparison Between the Baseline, Model 1, Model 2 and Model 3

6. Optimum Epoch

Following the generation of the overfitted model, a manual approach was adopted to find the optimum epoch: the cut-off point after which there is no further improvement in the model, identified by looking at the validation loss graph in the code below.

This optimum epoch will be used at the end, after improving the model through regularisation and hyperparameter tuning. It will be applied when training on the whole dataset and evaluating on unseen data (test data).
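The manual cut-off amounts to taking the epoch with the lowest recorded validation loss; a sketch (with a hypothetical val_loss history):

```python
def optimum_epoch(val_losses):
    """Epoch (1-indexed) with the lowest validation loss in a
    Keras History list such as history.history['val_loss']."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

# Hypothetical validation-loss curve: improves, then overfits
val_loss = [0.60, 0.41, 0.33, 0.29, 0.31, 0.36]
print(optimum_epoch(val_loss))  # -> 4
```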

7. Keras Callbacks

Keras callbacks will handle the optimum epoch by interrupting training once there is no further improvement.

Keras callbacks are tested here and compared with the manual technique; the epoch number at which training stopped is then used during regularization and hyperparameter tuning.

Since the number of epochs changes on each run with callbacks, the callbacks results are fixed here so that they can be compared. When training on the whole dataset and evaluating on the test data, the callbacks will be used again.
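A sketch of such a callback (the patience value here is an assumption, not the coursework's exact setting):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop training once validation loss stops improving for a few epochs,
# and keep the best weights seen so far.
early_stop = EarlyStopping(monitor='val_loss',
                           patience=5,              # epochs with no improvement
                           restore_best_weights=True)

# model.fit(..., validation_data=val_generator, callbacks=[early_stop])
```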

7.1 Re-defining Evaluation Model Performance with Callbacks Parameter

7.2 Evaluate the Model Performance

7.3 Plotting Training and Validation Loss and Accuracy

The optimum epoch when the callbacks method was applied is 35.

7.4 Comparing the Optimum Epoch using Manual and Callbacks Approaches

8. Overfitting Techniques

8.1 First Technique: Reducing the Network Size

8.1.1 Model 1: Evaluation With 32 Neurons

8.1.1.1 Plotting Validation Loss Model With 32 Neurons and the Original Neuron

8.1.2 Model 2: Evaluation With 64 Neurons

8.1.2.1 Plotting Validation Loss Model With 64 Neurons and the Original Neuron Size

8.1.3 Model 3: Evaluation With 128 Neurons

8.1.3.1 Plotting Validation Loss Model With 128 Neurons and the Original Neuron Size

8.1.4 Comparing Between Values With and Without Changing in Network Size

8.1.4.1 Plotting Validation Loss of Different Network Size (32, 64, 128, 256)

8.2 Second Technique: Data Augmentation

8.2.1 Evaluate the Model Performance

8.2.2 Plotting Training and Validation Loss and Accuracy

8.2.3 Plotting Validation Loss of Data Augmentation in Comparison to Overfitted Model

8.2.4 Comparing Between Models With and Without Data Augmentation

8.3 Third Technique: Adding Dropout

8.3.1 First Approach: Adding Dropout only in the Dense Layer

The model was redefined by adding one dropout layer in the dense block.
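A sketch of what this classifier head looks like (the dense size of 64 is an assumption; the rate argument corresponds to the 0.1 / 0.3 / 0.5 values evaluated below):

```python
from tensorflow.keras import layers

def classifier_head(rate):
    """Dense classifier block with a single Dropout layer; `rate` is
    the fraction of activations randomly zeroed at training time."""
    return [layers.Flatten(),
            layers.Dense(64, activation='relu'),
            layers.Dropout(rate),                 # dropout in the dense block
            layers.Dense(1, activation='sigmoid')]
```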

8.3.1.1 Re-defining the Model Architecture

8.3.1.2 Model 1: Evaluation With 0.1 Dropout

8.3.1.2.1 Plotting Training and Validation Loss and Accuracy
8.3.1.2.2 Plotting Validation Loss of 0.1 Dropout in Comparison to Augmented Model

8.3.1.3 Model 2: Evaluation With 0.3 Dropout

8.3.1.3.1 Plotting Training and Validation Loss and Accuracy
8.3.1.3.2 Plotting Validation Loss of 0.3 Dropout in Comparison to Augmented Model

8.3.1.4 Model 3: Evaluation With 0.5 Dropout

8.3.1.4.1 Plotting Training and Validation Loss and Accuracy
8.3.1.4.2 Plotting Validation Loss of 0.5 Dropout in Comparison to Augmented Model

8.3.2 Second Approach: Adding Dropout After Each Layer

8.3.2.1 Re-defining the Model Architecture

8.3.2.2 Model Evaluation with Dropout After Each Layer

8.3.2.3 Plotting Training and Validation Loss and Accuracy

8.3.2.4 Plotting Validation Loss with Multi-Dropout in Comparison to Augmented Model

8.3.2.5 Comparing Between Models With and Without Dropout

Although the best value was for the model with 0.1 dropout, the model without dropout outperformed all the approaches when compared using the line graph.

Therefore, the augmented model will be utilized in the next step.

8.4 Fourth Technique: Adding Weight Regularization

8.4.1 First Approach: Adding Regularizer in all Layers

8.4.1.1 Model 1: Evaluation of L1 Regularizer

8.4.1.1.1 Plotting Training and Validation Loss and Accuracy

8.4.1.2 Model 2: Evaluation of L2 Regularizer

8.4.1.2.1 Plotting Training and Validation Loss and Accuracy

8.4.1.3 Model 3: Evaluation With L1_L2 Regularizers

8.4.1.3.1 Plotting Training and Validation Loss and Accuracy

8.4.1.4 Plotting Validation Loss of L2 Regularizer in Comparison to Data Augmentation Model

8.4.2 Second Approach: Adding Only One Regularizer in the Dense Layer

8.4.2.1 Model 1: Evaluation of L1 Regularizer

8.4.2.1.1 Plotting Training and Validation Loss and Accuracy

8.4.2.2 Model 2: Evaluation of L2 Regularizer

8.4.2.2.1 Plotting Training and Validation Loss and Accuracy

8.4.2.3 Model 3: Evaluation of L1-L2 Regularizers

8.4.2.3.1 Plotting Training and Validation Loss and Accuracy

Comparing Between L2 Regularizer (Single Layer) and Data Augmented Model

Comparing Between Models With and Without Regularizers

9. Hyperparameters Tuning

9.1 Batch Size

9.1.1 Model 1: Evaluation With Batch Size 64

9.1.1.1 Plotting Validation Loss Model With 64 and 20 Batch Size

9.1.2 Model 2: Evaluation With Batch Size 128

9.1.2.1 Plotting Validation Loss Model With 128 and 20 Batch Size

9.1.3 Model 3: Evaluation With Batch Size 256

9.1.3.1 Plotting Validation Loss Model With 256 and 20 Batch Size

9.1.4 Comparing Between Values With and Without Changing in Batch Size

9.1.4.1 Plotting Validation Loss of 20, 64, 128 and 256

9.2 Filter Size (Kernel Size)

9.2.1 Model 1: Small Filter Size

9.2.1.1 Plotting Validation Loss Model With Small and Original Filter Size

9.2.2 Model 2: Increasing Filter Size

9.2.2.1 Plotting Validation Loss Model With Large and Original Filter Size

9.2.3 Comparing Between Values With and Without Changing in Filter Size

9.2.3.1 Plotting Validation Loss of Small, Large and Original Filter Size

9.3 Activation Function

9.3.1 Evaluation With tanh Activation Function

9.3.1.1 Comparing Between Values With and Without Changing in Activation Function

9.3.1.2 Plotting Validation Loss With relu and tanh Activation Function

9.4 Model Optimizer

9.4.1 Evaluation With adam Optimizer

9.4.2 Comparing Between Values With and Without Changing in Model Optimizer

9.4.2.1 Plotting Validation Loss With adam and rmsprop Optimizers

adam will be used as it outperformed rmsprop.

9.5 Filter Window Size

9.5.1 Evaluation With (5,5) Filter Window Size

9.5.2 Comparing Between Values With and Without Changing in Filter Window Size

9.5.2.1 Plotting Validation Loss with 3x3 and 5x5 Window Size

9.6 Number of Layers

9.6.1 Increasing the Number of Layers

9.6.1.1 Evaluate the Model Performance

9.6.1.2 Comparing Between Values With and Without Adding Layers

9.6.1.3 Plotting Loss Validation With and Without Adding Layers

9.7 With Padding

The padding technique is used so that the spatial dimensions of the output feature map are the same as those of the input. This is done by adding extra rows and columns around the input.
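The effect on the feature-map size can be sketched with the standard convolution size formulas (as Keras computes them for 'valid' and 'same' padding):

```python
import math

def conv_output_size(n, kernel, stride=1, padding='valid'):
    """Spatial output size of a convolution along one dimension."""
    if padding == 'same':
        return math.ceil(n / stride)              # 'same' keeps size when stride=1
    return math.floor((n - kernel) / stride) + 1  # 'valid' shrinks the map

# A 3x3 'valid' convolution shrinks 150 -> 148, while 'same' keeps 150
print(conv_output_size(150, 3), conv_output_size(150, 3, padding='same'))
```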

9.7.1 Add Padding Parameter to the Model Architecture

9.7.1.1 Evaluate the Model Performance

9.7.1.2 Comparing Between Values With and Without Padding

9.7.1.3 Plotting Validation Loss Model With and Without Padding

Padding will be used in the subsequent models as it shows a slight improvement when applied.

10. Deployment of Advanced Best Practices

10.1 Depthwise Separable Convolution

10.1.1 Evaluate the Model Performance

10.1.2 Comparing Between Values With and Without Depthwise Separable Convolution

10.1.3 Plotting Validation Loss Model With and Without Depthwise

10.2 Batch Normalization

10.2.1 Evaluate the Model Performance

10.2.2 Comparing Between Values With and Without Batch Normalization

10.2.3 Plotting Validation Loss Model With and Without Batch Normalization

Section Three:

Using a Pretrained Convnet

1. Feature Extraction

The model below was reused from a GitHub repository and is cited in the references section.

1.1 Option 1: Extract then Train

1.1.2 Fast Feature Extraction Without Data Augmentation

The extracted features are currently of shape (samples, 4, 4, 512)

Flatten to (samples, 8192) ready for input to a dense classifier

Define the densely-connected classifier (with dropout for regularisation)

and train it on the recorded data and labels:
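The flattening step can be sketched with NumPy (a small batch of 4 samples is used here for illustration; the actual feature arrays cover the full splits):

```python
import numpy as np

# Extracted conv-base features have shape (samples, 4, 4, 512);
# flatten each sample to a 4*4*512 = 8192-dimensional vector so it
# can feed a densely connected classifier.
features = np.zeros((4, 4, 4, 512))
flat = np.reshape(features, (len(features), 4 * 4 * 512))
print(flat.shape)  # -> (4, 8192)
```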

1.2 Option 2: Extract and Train

1.2.1 Feature Extraction With Data Augmentation

Due to the lack of a GPU, it was very difficult to run the part above.

2. Fine-tuning

Section Four:

1. Training the Final Model on the Test Dataset

1.2 The TensorFlow Visualization Framework

1.3 Evaluate the Model on the Test Dataset

1.4 Comparison Between the Baseline, Trained and Final Model

Section Five:

1. Convnet Visualisation

1.2 Display X-rays

1.3 Visualizing Intermediate Activations

Following the book "Deep Learning with Python", some visualisations were attempted to find out how the filters work, such as detecting the edges in an image.

Now extract and plot every channel in each of the 8 activation maps, stacking the results in one big image tensor with the channels side by side.
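Extracting these activations can be sketched as a multi-output model over the first layers (assuming a trained model is in scope; the plotting itself is omitted):

```python
from tensorflow.keras import models

def activation_model(model, n_layers=8):
    """Return a model whose outputs are the activations of the
    first `n_layers` layers of `model`."""
    outputs = [layer.output for layer in model.layers[:n_layers]]
    return models.Model(inputs=model.input, outputs=outputs)

# activations = activation_model(model).predict(img_tensor)
# each element of `activations` is one feature map to plot
```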

1.4 Visualising Convnet Filters

Let's visualise filter 0 in layer block3_conv1

Section Six:

1. Results

Three models were trained from scratch:

  1. The 1st is a small, underfitted model, which achieved 92% accuracy.
  2. The 2nd model has increased capability, with more layers and larger parameter values, and achieved 94% accuracy.
  3. The 3rd is a powerful, overfitted model, which achieved 95% accuracy.

Several overfitting techniques were experimented with:

  1. Adding dropout only in the dense layer (with 0.1, 0.3 and 0.5 dropout), which achieved validation losses of 0.232, 0.233 and 0.244 respectively
  2. Adding dropout after each layer, which achieved a validation loss of 0.239
  3. Adding a weight regularizer in all layers, which achieved a validation loss of 0.308
  4. Adding only one weight regularizer in the dense layer, which achieved a validation loss of 0.323

Next, hyperparameter tuning was performed on several factors:

  1. Batch size: 64, 128 and 256 achieved validation losses of 0.217, 0.243 and 0.222 respectively
  2. Filter size (kernel size): a small filter size was compared against an increased one, and both achieved a validation loss of 0.23
  3. Activation function: tanh achieved a validation loss of 0.335, while relu achieved 0.229
  4. Model optimizer: adam and rmsprop achieved validation losses of 0.225 and 0.229 respectively
  5. Filter window size: 5x5 and 3x3 windows were tested, achieving validation losses of 0.218 and 0.229
  6. Number of layers: adding more layers resulted in a validation loss of 0.323, which is worse than the original 0.229
  7. Padding: achieved a better validation loss of 0.213, hence it will be used

Next, advanced best practices were deployed: depthwise separable convolution, which, at 69.3%, did not improve the validation loss, and batch normalization, which also provided no improvement, with a validation loss of 0.405.

Since the image dataset is relatively small, a pre-trained network was utilized in two ways:

  1. The 1st is feature extraction, in which representations learned by another network are used to extract features. A convolutional base is built from the previous network, the new image dataset is run through it, and a new densely connected model is trained on the output. At this point there are two options. The 1st is fast feature extraction without data augmentation, which uses ImageDataGenerator; since this option does not use data augmentation, which prevents overfitting on small image datasets, it tends to overfit very early. The 2nd option is feature extraction with data augmentation, which usually takes more time; due to the lack of a GPU on the laptop used, it was not possible to generate good results.

  2. The 2nd is fine-tuning, in which a few top layers of a frozen model are unfrozen and then trained together with the newly added part of the model.

2. Conclusions

3. References

  1. Dataset: COVID-19 Radiography Database | Kaggle. [online]. Available from: https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database [Accessed January 28, 2022].
  2. Rahman, T., Khandakar, A., Qiblawey, Y., Tahir, A., Kiranyaz, S., Kashem, S.B.A., Islam, M.T., Maadeed, S.A., Zughaier, S.M., Khan, M.S. and Chowdhury, M.E., 2020. Exploring the Effect of Image Enhancement Techniques on COVID-19 Detection using Chest X-ray Images
  3. M.E.H. Chowdhury, T. Rahman, A. Khandakar, R. Mazhar, M.A. Kadir, Z.B. Mahbub, K.R. Islam, M.S. Khan, A. Iqbal, N. Al-Emadi, M.B.I. Reaz, M. T. Islam, “Can AI help in screening Viral and COVID-19 pneumonia?” IEEE Access, Vol. 8, 2020, pp. 132665 – 132676
  4. Chollet, F. (2017), Deep Learning with Python , Manning
  5. COVID Live - Coronavirus Statistics - Worldometer. [online]. Available from: https://www.worldometers.info/coronavirus/ [Accessed Feb 03, 2022].
  6. Coronavirus. [online]. Available from: https://www.who.int/health-topics/coronavirus#tab=tab_1 [Accessed Feb 03, 2022].
  7. Penarrubia, L., Ruiz, M., Porco, R., Rao, S.N., Juanola-Falgarona, M., Manissero, D., et al.: Multiple assays in a real-time RT-PCR SARS-CoV-2 panel can mitigate the risk of loss of sensitivity by new genomic variants during the COVID-19 outbreak. Int. J. Infect. Dis. 97, 225–229 (2020)
  8. Bhatt, D., Patel, C., Talsania, H., Patel, J., Vaghela, R., Pandya, S., Modi, K. and Ghayvat, H. CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope.
  9. Zhang, YD., Satapathy, S.C., Liu, S. et al. A five-layer deep convolutional neural network with stochastic pooling for chest CT-based COVID-19 diagnosis. Machine Vision and Applications 32, 14 (2021). https://doi.org/10.1007/s00138-020-01128-8
  10. Khozeimeh, F., Sharifrazi, D., Izadi, N.H. et al. Combining a convolutional neural network with autoencoders to predict the survival chance of COVID-19 patients. Sci Rep 11, 15343 (2021). https://doi.org/10.1038/s41598-021-93543-8
  11. Abbas, A., Abdelsamea, M.M. & Gaber, M.M. Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network. Appl Intell 51, 854–864 (2021). https://doi.org/10.1007/s10489-020-01829-7
  12. Hira, S., Bai, A. & Hira, S. An automatic approach based on CNN architecture to detect Covid-19 disease from chest X-ray images. Appl Intell 51, 2864–2889 (2021). https://doi.org/10.1007/s10489-020-02010-w
  13. lisa-faster-R-CNN/minivggnet.py at master · agoila/lisa-faster-R-CNN. [online]. Available from: https://github.com/agoila/lisa-faster-R-CNN/blob/master/pyimagesearch/nn/conv/minivggnet.py [Accessed March 15, 2022].
  14. Practical Guide to Hyperparameters Optimization for Deep Learning Models. [online]. Available from: https://blog.floydhub.com/guide-to-hyperparameters-search-for-deep-learning-models/ [Accessed March 28, 2022].
  15. My previous Neural Networks Coursework "Prediction of Obesity Levels Based on Eating Habits and Physical Condition".

4. Appendix

This is an automated tool used when tuning hyperparameters, for example when the number of filters, neurons and other parameters is unknown. Instead of optimising them manually, below is an automated way to handle the tuning. For this coursework, however, the manual method was preferred.
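As an illustration of what such automation looks like, here is a hand-rolled grid search (not the actual tool used in the notebook); `evaluate` stands in for training a model and returning its validation loss, and the search space below is hypothetical:

```python
from itertools import product

def grid_search(space, evaluate):
    """Try every combination of hyperparameter choices and keep the
    configuration with the lowest loss."""
    names = list(space)
    best_cfg, best_loss = None, float('inf')
    for values in product(*(space[n] for n in names)):
        cfg = dict(zip(names, values))
        loss = evaluate(cfg)       # e.g. train the model, return val loss
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

# Toy example: the "loss" is minimised when filters == 64
space = {'filters': [32, 64, 128], 'batch_size': [20, 64, 128]}
best, loss = grid_search(space, lambda cfg: abs(cfg['filters'] - 64) / 64)
print(best['filters'])  # -> 64
```

Random search over the same space is a common alternative when the grid grows too large.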